Dynamic Branch Decoupled Architecture
Authors
Abstract
We propose an alternative approach to branch resolution that builds on earlier work on decoupled memory architectures. Branch decoupling splits a single-instruction-stream program into two streams. One stream is dedicated solely to resolving branches (both the branch condition and the branch target) as early as possible. The other, computing stream consumes the resolved branch targets through a queue. We previously proposed a compiler-based, static branch decoupling methodology. In this paper, we propose a dynamic branch decoupled (DBD) architecture. Simulations show speedups for both the SPEC95 integer benchmarks and the SPEC95 FP benchmarks over a 2-level adaptive branch predictor. The average number of branch penalty cycles per instruction is lower for DBD than for the 2-level adaptive branch predictor.
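The two-stream idea can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the DBD hardware described in the paper: the function names (`run_branch_stream`, `run_compute_stream`, `decoupled_sum`), the block labels, and the example loop are all hypothetical, and the branch stream is run to completion first, whereas a real design would run both streams concurrently with a bounded queue.

```python
from collections import deque


def run_branch_stream(n, queue):
    """Branch stream (hypothetical): executes only the loop-control slice
    of a do-while loop over n items and enqueues each resolved target."""
    i = 0
    while True:
        i += 1
        taken = i < n  # branch condition, resolved ahead of the compute stream
        queue.append("loop" if taken else "exit")  # resolved branch target
        if not taken:
            break


def run_compute_stream(data, queue):
    """Compute stream (hypothetical): does the real work and never evaluates
    the loop condition itself; it only consumes resolved targets."""
    total = 0
    i = 0
    block = "loop"
    while block == "loop":
        total += data[i]         # the computation of the loop body
        i += 1
        block = queue.popleft()  # next block comes from the queue
    return total


def decoupled_sum(data):
    """Sum a non-empty list using the two decoupled streams."""
    if not data:
        raise ValueError("the toy do-while loop needs at least one item")
    queue = deque()
    # Simplification: the branch stream runs fully ahead of the compute stream.
    run_branch_stream(len(data), queue)
    return run_compute_stream(data, queue)


if __name__ == "__main__":
    print(decoupled_sum([1, 2, 3, 4]))
```

In this model the compute stream never predicts a branch: every control decision arrives already resolved, which is the property the abstract attributes to branch decoupling.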
Similar resources
A Decoupled Fetch-Execute Engine with Static Branch Prediction Support
We describe a method for supporting static branch prediction on a decoupled fetch-execute pipeline. Using instruction buffers to decouple instruction fetch from the execute pipeline is an effective way to minimize instruction cache penalties by allowing instruction fetch and stall miss handling to proceed independent of the execution pipeline. Dynamic branch prediction is typically used with su...
An Integrated Partitioning and Scheduling Based Branch Decoupling
Conditional branch induced control hazards cause significant performance loss in modern out-of-order superscalar processors. Dynamic branch prediction techniques help alleviate the penalties associated with conditional branch instructions. However, branches still constitute one of the main hurdles towards achieving higher ILP. Dynamic branch prediction relies on the temporal locality of and spa...
Tolerating Branch Predictor Latency on SMT
Simultaneous Multithreading (SMT) tolerates latency by executing instructions from multiple threads. If a thread is stalled, resources can be used by other threads. However, fetch stall conditions caused by multi-cycle branch predictors prevent SMT from achieving its full potential performance, since the flow of fetched instructions is halted. This paper proposes and evaluates solutions to deal with...
Survey the Security Function of Integration of Vehicular Ad Hoc Networks with Software-Defined Networks
In recent years, Vehicular Ad Hoc Networks (VANETs) have emerged as one of the most active areas in the field of technology to provide a wide range of services, including road safety, passenger's safety, amusement facilities for passengers and emergency facilities. Due to the lack of flexibility, complexity and high dynamic network topology, the development and management of current Vehicular A...
Characterization of embedded applications for decoupled processor architecture
Performance needs of embedded applications will lead to the use of dynamic execution on embedded processors in the next few years. However, complete out-of-order superscalar cores are still expensive in terms of silicon area and power dissipation. In this paper, we study the suitability of a more limited form of dynamic execution, namely decoupled architecture, for embedded applications. Deco...